Parsimonious Linear Fingerprinting for Time Series
Abstract
We study the problem of mining and summarizing multiple time series effectively and efficiently. We propose PLiF, a novel method that discovers essential characteristics (“fingerprints”) by exploiting the joint dynamics in numerical sequences. Our fingerprinting method has the following benefits: (a) it leads to interpretable features; (b) it is versatile: PLiF enables numerous mining tasks, including clustering, compression, visualization, forecasting, and segmentation, matching top competitors in each task; and (c) it is fast and scalable, with complexity linear in the length of the sequences. We ran experiments on both synthetic and real datasets, including human motion capture data (17 MB of human motions), sensor data (166 sensors), and network router traffic data (18 million raw updates over 2 years). Despite its generality, PLiF outperforms the top clustering methods on clustering and the top compression methods on compression (3 times better reconstruction error at the same compression ratio); it also gives meaningful visualizations while scaling linearly.
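The abstract does not describe the algorithm itself, so the following is only a toy illustration of the general idea of summarizing a sequence by its linear dynamics. It is not PLiF: it assumes a plain second-order autoregressive fit as a stand-in, and the names `ar2_fingerprint` and `sine` are hypothetical. The fit makes a single pass over the sequence, so its cost is linear in the sequence length.

```python
import math

def ar2_fingerprint(x):
    """Least-squares fit of x[t] = a1*x[t-1] + a2*x[t-2]; returns (a1, a2).

    Toy stand-in for a dynamics-based fingerprint (an assumption for
    illustration, not the PLiF algorithm).
    """
    # Accumulate the 2x2 normal equations in one pass over the sequence.
    s11 = s12 = s22 = b1 = b2 = 0.0
    for t in range(2, len(x)):
        p1, p2, y = x[t - 1], x[t - 2], x[t]
        s11 += p1 * p1
        s12 += p1 * p2
        s22 += p2 * p2
        b1 += p1 * y
        b2 += p2 * y
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("sequence too short or degenerate for an AR(2) fit")
    # Solve the 2x2 system with Cramer's rule.
    a1 = (b1 * s22 - b2 * s12) / det
    a2 = (b2 * s11 - b1 * s12) / det
    return a1, a2

def sine(freq, phase, amp, n=200):
    """Synthetic test sequence: amp * sin(freq * t + phase), t = 0..n-1."""
    return [amp * math.sin(freq * t + phase) for t in range(n)]

# Two sequences with the same frequency but different phase and amplitude,
# and one sequence with a different frequency.
slow_a = ar2_fingerprint(sine(0.1, 0.0, 1.0))
slow_b = ar2_fingerprint(sine(0.1, 1.3, 5.0))
fast = ar2_fingerprint(sine(0.5, 0.4, 2.0))

# Euclidean distance between fingerprints, usable as a clustering distance.
print(math.dist(slow_a, slow_b), math.dist(slow_a, fast))
```

A noiseless sinusoid satisfies x[t] = 2cos(freq)·x[t-1] − x[t-2] exactly, so the fitted pair depends only on the frequency, not on phase or amplitude. The two slow sequences therefore get nearly identical fingerprints, and the coefficient a1 can be read directly as 2cos(freq), which is the kind of interpretability the abstract alludes to.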
Related resources
Analysis of Fast Input Selection: Application in Time Series Prediction
In time series prediction, accuracy of predictions is often the primary goal. At the same time, however, it would be very desirable if we could give interpretation to the system under study. For this goal, we have devised a fast input selection algorithm to choose a parsimonious, or sparse set of input variables. The method is an algorithm in the spirit of backward selection used in conjunction...
Which Methodology is Better for Combining Linear and Nonlinear Models for Time Series Forecasting?
Both theoretical and empirical findings have suggested that combining different models can be an effective way to improve the predictive performance of each individual model. This is especially the case when the models in the ensemble are quite different. Hybrid techniques that decompose a time series into its linear and nonlinear components are one of the most important kinds of hybrid model...
Fitting of Count Time Series Models on the Number of Patients Referred to Addiction Treatment Centers in Semnan County
Count data over time are observed in many application areas. Many researchers use time series patterns to analyze such data. In this paper, linear Poisson and negative binomial count time series models with explanatory variables are studied for this type of data. The likelihood analysis and the evaluation of count time series models based on generalized linear models are pres...
Sequential input selection algorithm for long-term prediction of time series
In time series prediction, making accurate predictions is often the primary goal. At the same time, interpretability of the models would be desirable. For the latter goal, we have devised a sequential input selection algorithm (SISAL) to choose a parsimonious, or sparse, set of input variables. Our proposed algorithm is a sequential backward selection type algorithm based on a cross-validation ...
Modeling and prediction of time-series of monthly copper prices
One of the main tasks in analyzing and designing a mining system is predicting the future behavior of prices. In this paper, different prediction methods from the fields of econometrics and financial management, such as ARIMA, TGARCH, and stochastic differential equations, are evaluated on the time series of monthly copper prices. Moreover, the performance of these metho...
Journal:
- PVLDB
Volume 3, Issue
Pages -
Publication date: 2010